
    Unsupervised Summarization by Jointly Extracting Sentences and Keywords

    We present RepRank, an unsupervised graph-based ranking model for extractive multi-document summarization in which word-to-word, sentence-to-sentence, and word-to-sentence similarities are estimated from the distances between their vector representations in a unified vector space. To obtain suitable representations, we propose a self-attention-based learning method that represents a sentence as the weighted sum of its word embeddings, with the weights concentrated on the words expected to better reflect the content of a document. We show that salient sentences and keywords can be extracted in a joint, mutually reinforcing process using our learned representations, and prove that this process always converges to a unique solution, leading to improved performance. A variant of the absorbing random walk and a corresponding sampling-based algorithm are also described to avoid redundancy and increase diversity in the summaries. Experimental results on multiple benchmark datasets show that RepRank achieves the best or comparable ROUGE scores. Comment: 10 pages (including 2 pages of references), 1 figure
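
    The weighted-sum sentence representation mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' model: the attention scores here are plain dot products with a hypothetical `query` vector standing in for learned parameters, and the function names are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sentence_vector(word_vectors, query):
    """Represent a sentence as an attention-weighted sum of its word vectors.

    `query` is a hypothetical stand-in for learned attention parameters.
    """
    scores = [sum(w * q for w, q in zip(vec, query)) for vec in word_vectors]
    weights = softmax(scores)
    dim = len(query)
    sentence = [
        sum(weights[k] * word_vectors[k][d] for k in range(len(word_vectors)))
        for d in range(dim)
    ]
    return sentence, weights


def cosine(u, v):
    """Cosine similarity, usable for word-word, word-sentence, or sentence-sentence pairs."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy usage: two 2-dimensional word vectors and a query aligned with the first.
words = [[1.0, 0.0], [0.0, 1.0]]
sentence, weights = sentence_vector(words, query=[1.0, 0.0])
```

    Because words and sentences live in the same space, the same `cosine` function can compare any pair of them, which is the property a joint ranking over sentences and keywords relies on.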

    SparseGAN: Sparse Generative Adversarial Network for Text Generation

    Learning a neural text generation model under the framework of generative adversarial networks (GANs) remains challenging because the entire training process is not differentiable. Existing training strategies suffer from either unreliable gradient estimates or imprecise sentence representations. Inspired by the principle of sparse coding, we propose SparseGAN, which generates semantically interpretable yet sparse sentence representations as inputs to the discriminator. The key idea is to treat the embedding matrix as an over-complete dictionary and to use a linear combination of very few selected word embeddings to approximate the output feature representation of the generator at each time step. With such semantically rich representations, we not only reduce unnecessary noise for efficient adversarial training, but also make the entire training process fully differentiable. Experiments on multiple text generation datasets yield performance improvements, especially in sequence-level metrics such as BLEU.
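
    The idea of approximating a feature vector with a linear combination of very few dictionary atoms can be sketched with classical matching pursuit. This is a swapped-in technique chosen for illustration: the abstract does not say how SparseGAN selects embeddings, and this greedy procedure does not show the differentiable training the paper describes. The names `matching_pursuit`, `dictionary`, and `target` are hypothetical, and atoms are assumed to have unit norm.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(dictionary, target, k):
    """Greedily approximate `target` using at most `k` atoms.

    Assumes every atom in `dictionary` has unit norm.
    Returns ({atom_index: coefficient}, approximation).
    """
    residual = list(target)
    coeffs = {}
    for _ in range(k):
        # Pick the atom most correlated with what is still unexplained.
        best = max(range(len(dictionary)),
                   key=lambda j: abs(dot(dictionary[j], residual)))
        c = dot(dictionary[best], residual)
        coeffs[best] = coeffs.get(best, 0.0) + c
        # Remove the explained component from the residual.
        residual = [r - c * a for r, a in zip(residual, dictionary[best])]
    approx = [
        sum(c * dictionary[j][d] for j, c in coeffs.items())
        for d in range(len(target))
    ]
    return coeffs, approx


# Toy "embedding matrix": four unit-norm atoms in 3 dimensions (over-complete).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.6, 0.8, 0.0]]
coeffs, approx = matching_pursuit(atoms, target=[2.0, 0.0, 1.0], k=2)
```

    In this toy case two atoms are enough to reconstruct the target, so the sparse code has two non-zero coefficients out of four possible atoms.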

    Improving Coreference Resolution by Leveraging Entity-Centric Features with Graph Neural Networks and Second-order Inference

    One of the major challenges in coreference resolution is how to make use of entity-level features defined over clusters of mentions rather than mention pairs. However, coreferent mentions are usually spread far apart across a text, which makes it extremely difficult to incorporate entity-level features. We propose a graph neural network-based coreference resolution method that captures entity-centric information by encouraging the sharing of features across all mentions that probably refer to the same real-world entity. Mentions are linked to each other via edges that model how likely two linked mentions are to point to the same entity. With such graphs, features can be shared between mentions by message-passing operations in an entity-centric manner. A global inference algorithm with up to second-order features is also presented to optimally cluster mentions into consistent groups. Experimental results show that our graph neural network-based method combined with the second-order decoding algorithm (named GNNCR) achieves close to state-of-the-art performance on the English CoNLL-2012 Shared Task dataset.
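
    Feature sharing over a weighted mention graph can be illustrated with one round of message passing. This is a minimal sketch under stated assumptions, not the GNNCR architecture: `edge_weights[i][j]` is a hypothetical symmetric score in [0, 1] for how likely mentions `i` and `j` corefer, each mention keeps a self-loop of weight 1, and the update is a plain weighted average with no learned parameters.

```python
def message_passing_step(features, edge_weights):
    """One round of weighted-average message passing over mentions.

    features: one feature vector (list of floats) per mention.
    edge_weights: square matrix; entry [i][j] is an assumed coreference score.
    """
    n = len(features)
    dim = len(features[0])
    updated = []
    for i in range(n):
        total_weight = 1.0          # self-loop keeps the mention's own features
        aggregate = list(features[i])
        for j in range(n):
            if j == i:
                continue
            w = edge_weights[i][j]
            total_weight += w
            for d in range(dim):
                aggregate[d] += w * features[j][d]
        updated.append([a / total_weight for a in aggregate])
    return updated


# Toy usage: mentions 0 and 1 are linked with weight 1.0; mention 2 is isolated.
feats = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
weights = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
shared = message_passing_step(feats, weights)
```

    After one step the two linked mentions hold the same averaged features, while the isolated mention is unchanged, which is the entity-centric sharing effect the abstract describes in prose.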

    Self-tuning fuzzy controller for air-conditioning systems

    Master's (Master of Engineering)

    SPARQL Query Mediation for Data Integration

    The Semantic Web provides a set of promising technologies that make sophisticated data integration much easier, because data on the Semantic Web can be connected by links and complex queries can be executed against the resulting linked dataset. Although Semantic Web techniques offer RDF/OWL to support schematic mappings between diverse data sources, large-scale data integration is still severely hampered by various types of data-level semantic heterogeneity among the data sources. In this paper, we show that SPARQL queries intended to execute over multiple heterogeneous data sources can be mediated automatically.